Foreword

This Technical Specification has been produced by the 3rd Generation Partnership Project (3GPP).

The contents of the present document are subject to continuing work within the TSG and may change following formal 
TSG approval. Should the TSG modify the contents of the present document, it will be re-released by the TSG with an 
identifying change of release date and an increase in version number as follows: 

Version x.y.z 

where: 

x the first digit:

1 presented to TSG for information; 

2 presented to TSG for approval; 

3 or greater indicates TSG approved document under change control. 

y the second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, 
updates, etc. 

z the third digit is incremented when editorial-only changes have been incorporated in the document.

1 Scope

The present document contains an electronic copy of the ANSI-C code for the DSR Extended Advanced Front-end. The
ANSI-C code is necessary for a bit-exact implementation of the DSR Extended Advanced Front-end.



2 References

The following documents contain provisions which, through reference in this text, constitute provisions of the present
document.

[1] ETSI ES 202 050: "Distributed Speech Recognition; Advanced Front-end Feature Extraction Algorithm; Compression Algorithm", Oct 2002.

[2] ETSI ES 202 212: "Distributed Speech Recognition; Extended Advanced Front-end Feature Extraction Algorithm; Compression Algorithm, Back-end Speech Reconstruction Algorithm", Nov 2003.

[3] 3GPP TS 26.177: "Speech Enabled Services (SES); Distributed Speech Recognition (DSR) extended advanced front-end test sequences".



3 Definitions and abbreviations 

3.1 Definitions 

Definitions of terms used in the present document can be found in [1] and [2].

3.2 Abbreviations 

For the purpose of the present document, the following abbreviations apply: 

ANSI American National Standards Institute 

I/O Input/Output 

RAM Random Access Memory 

ROM Read Only Memory 

AFE Advanced Front-end 

X-AFE extended Advanced Front-end 

DSR Distributed Speech Recognition 



4 C code structure

This clause gives an overview of the structure, contents and organization of the bit-exact C code attached to the
present document.

The C code has been verified on the following systems:

- Sun Microsystems workstations with the GNU gcc compiler;

- IBM PC compatible computers with the Linux operating system and the GNU gcc compiler.

ANSI-C was selected as the programming language because portability was desirable.

4.1 Contents of the C source code 

The distributed files with suffix "c" contain the source code and the files with suffix "h" are the header files.

Makefiles are provided for the platforms on which the C code has been verified (listed above).



4.2 Program execution 



There are separate executables for the FrontEnd and for Vector Quantization, each with and without Extensions. The
command line options are described below.

<> - indicates the allowed parameter values for the given option when running the executable.
() - indicates the default parameter value.

FrontEnd w/ Extension:

USAGE: bin/ExtAdvFrontEnd infile HTK_outfile pitch_outfile class_outfile [options]

OPTIONS:

-q                    Quiet mode (FALSE)
-F format             Input file format <NIST,HTK,RAW> (NIST)
-fs freq              Sampling frequency in kHz <8,16> (8)
-swap                 Change input byte ordering (Native)
-noh                  No HTK header to output file (FALSE)
-noc0                 No c0 coefficient to output feature vector (FALSE)
-nologE               No logE component to output feature vector (FALSE)
-skip_header_bytes n  Skip header, first n bytes (only for -F RAW)

The options -noh, -noc0, -nologE and -skip_header_bytes are not used and should be left at their default settings.

FrontEnd w/o Extension:

USAGE: bin/AdvFrontEnd infile HTK_outfile [options]

OPTIONS: Same as FrontEnd w/ Extension.

Vector Quantization w/ Extension:

Usage: extcoder htk_file_in pitch_file_in class_file_in bitstream_file_out pitch_file_out txt_file_out -freq x -VAD/No_VAD

htk_file_in         Input mel-frequency cepstral coefficient file in HTK MFCC format.
pitch_file_in       Input pitch period file.
class_file_in       Input classification file.
bitstream_file_out  Output binary bitstream.
pitch_file_out      Output quantised pitch period file.
txt_file_out        Vector quantiser output in text format.
-freq x             Sampling frequency in kHz (8 or 16).
-VAD                Use voice activity detector data. The voice activity input file must have the same name as htk_file_in, but with extension .vad.
-No_VAD             Do not incorporate voice activity detector information in the output bitstream.

Vector Quantization w/o Extension:

Usage: coder htk_file_in bitstream_file_out txt_file_out -freq x -VAD/No_VAD

htk_file_in         Input mel-frequency cepstral coefficient file in HTK MFCC format.
bitstream_file_out  Binary output bitstream.
txt_file_out        Vector quantiser output in text format.
-freq x             Sampling frequency in kHz (8 or 16).
-VAD                Use voice activity detector data. The voice activity input file must have the same name as htk_file_in, but with extension .vad.
-No_VAD             Do not incorporate voice activity detector information in the output bitstream.
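
The naming rule for the -VAD option (same name as the HTK input file, but with extension .vad) can be sketched in C as follows. This is an illustration only: the helper name make_vad_name() and its return convention are assumptions and are not taken from the distributed source code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: derive the VAD file name from the HTK file name by
 * replacing the extension with ".vad" (or appending it if there is none).
 * Returns 0 on success, -1 if the result does not fit into out. */
int make_vad_name(const char *htk_name, char *out, size_t out_size)
{
    const char *dot = strrchr(htk_name, '.');
    const char *slash = strrchr(htk_name, '/');
    size_t stem_len;

    /* A dot inside a directory component is not a file extension. */
    if (dot == NULL || (slash != NULL && dot < slash))
        stem_len = strlen(htk_name);
    else
        stem_len = (size_t)(dot - htk_name);

    /* sizeof ".vad" is 5: four characters plus the terminating NUL. */
    if (stem_len + sizeof ".vad" > out_size)
        return -1;

    memcpy(out, htk_name, stem_len);
    memcpy(out + stem_len, ".vad", sizeof ".vad");
    return 0;
}
```

With this sketch, an input name of speech.cep yields speech.vad, which is the file the coder would be expected to find next to the HTK file when -VAD is given.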

File extension descriptions as generated by the sample script:

.cep - Binary file containing cepstral features in HTK format. Output from the FrontEnd, input to the vector quantizer.
.pitch - Binary file containing pitch information. Output from the FrontEnd, input to the vector quantizer. Only used for Extension.
.class - ASCII file containing class information. Output from the FrontEnd, input to the vector quantizer. Only used for Extension.
.bs - Binary file containing the bitstream. Output from the vector quantizer.
.log - Log files from the different executables.
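
The .cep files are described above as being in HTK format. As a rough sketch, the standard HTK parameter file layout starts with a 12-byte header in front of the feature vectors, which could be decoded as shown below. The sketch assumes big-endian storage, as in standard HTK files; the type and function names are hypothetical and are not part of the distributed code.

```c
#include <stdint.h>

/* Hypothetical view of the 12-byte HTK parameter file header. */
typedef struct {
    uint32_t n_samples;    /* number of feature vectors in the file */
    uint32_t samp_period;  /* frame period in units of 100 ns */
    uint16_t samp_size;    /* number of bytes per feature vector */
    uint16_t parm_kind;    /* code describing the parameter kind */
} HtkHeader;

/* Read a big-endian 32-bit value from a byte buffer. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Read a big-endian 16-bit value from a byte buffer. */
static uint16_t read_be16(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | (unsigned)p[1]);
}

/* Decode the header fields from the first 12 bytes of a file. */
void parse_htk_header(const unsigned char raw[12], HtkHeader *h)
{
    h->n_samples   = read_be32(raw);
    h->samp_period = read_be32(raw + 4);
    h->samp_size   = read_be16(raw + 8);
    h->parm_kind   = read_be16(raw + 10);
}
```

Decoding byte by byte keeps the sketch independent of the host byte order, so no swapping logic is needed on little-endian machines.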
4.3 Code hierarchy

Tables 1 to 3 are call graphs that show the functions used for AFE (table 1), VQ (table 2), and Extension (table 3).

Each column represents a call level and each cell a function. The functions contain calls to the functions in rightwards
neighboring cells. The time order in the call graphs is from the top downwards as the processing of a frame advances.
All standard C functions (printf(), fwrite(), etc.) have been omitted. Also, no basic operations (add(), L_add(), mac(),
etc.) or double precision extended operations (e.g. L_Extract()) appear in the graphs.

The basic operations are not counted as extending the depth; therefore the deepest level in this software is level 7.
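
To illustrate what a basic operation of this kind does, the sketch below shows a saturating 16-bit and 32-bit addition, which is the usual behaviour of add() and L_add() in ETSI/ITU-T style basic-operator libraries. It is an assumption that the distributed code follows the same convention; the names sat_add16() and sat_add32() are hypothetical and chosen to avoid implying the actual signatures.

```c
#include <stdint.h>

/* Saturating 16-bit addition: the result is clipped to the int16_t range
 * instead of wrapping around on overflow. */
int16_t sat_add16(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + (int32_t)b;
    if (sum > INT16_MAX)
        return INT16_MAX;
    if (sum < INT16_MIN)
        return INT16_MIN;
    return (int16_t)sum;
}

/* Saturating 32-bit addition, computed in 64 bits to detect overflow. */
int32_t sat_add32(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + (int64_t)b;
    if (sum > INT32_MAX)
        return INT32_MAX;
    if (sum < INT32_MIN)
        return INT32_MIN;
    return (int32_t)sum;
}
```

Because such operations are small leaf functions called from almost every routine, leaving them out of the call graphs keeps the tables readable without hiding any structure.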
Table 1: AFE call structure

main()
AdvProcessInit_B()
DoNoiseSupInit_B()
DoWaveProcInit_B()
DoCompCepsInit_B()
DoPostProcInit_B()
DoVADInit_F()
Do16kProcInit_B()
QMF_FIR_Init_B()
AdvProcessAlloc_B()
FlushAdvProcess_B()
AdvProcessDelete_B()
DoAdvProcess_B()
fir_initialization_B()
DP_HP_filters_B()
BufIn32Alloc()
DoNoiseSupAlloc_B()
DoWaveProcAlloc_B()
DoCompCepsAlloc_B()
DoPostProcAlloc_B()
DoVADAlloc_F()
Do16kProcAlloc_B()
DoVADFlush_F()
CvFeatInt2Float()
DoNoiseSupDelete_B()
DoWaveProcDelete_B()
DoCompCepsDelete_B()
DoPostProcDelete_B()
DoVADDelete_B()
BufIn32Free()
Do16kProcessing_B()
DoNoiseSup_B()
Get16k_p_bufferData16k_B()
Get16k_bufData16kSize_B()
Get16k_p_BandsForCoding16k_B()
Get16k_p_CodeForBands16k_B()
Get16k_dataHP_B()
VAD_F()
Log_2()
DoSigWindowing16_F1()
DoSigWindowing16_F2()
ff4NRFix32_B()
GetL15()
GetH15()
Mult16x32()
Add_Mult16x16_16()
Sub_Mult16x16_16()
Permut()
FFTtoPSD_F()
Square24d2_B()
Square24_B()
Get16k_BFC_dec_B()
GetBandsForCoding16k_B()
PSDMean_F()
NoiseEstimation_F1()
Sqrt_2()
Sqrt16_2()
NoiseEstimation_F2()
Sqrt_2()
Sqrt16_2()
FilterCalc_F()
SpeechQVar()
FilterBank16()
SpeechQSpec()
SpeechQMel()
DoGainFact_F1()
Log_2()
DoGainFact_F2()
Log_2()
DoMelIDCT_F16()
ApplyWF()
Get16k_dec1()
Get16k_dec2()
Get16k_decS()
DoSigWindowing16_F3()
ff4NRFix32_B()
GetL15()
GetH15()
Mult16x32()
Add_Mult16x16_16()
Sub_Mult16x16_16()
Permut()
FFTtoPSD_F()
Square24d2_B()
Square24_B()
DoMelFB_B()
CodeBands16k_B()
DoSpecSub16k_B()
Log_2()
UpDateDecal()
ApplyDecal()
DCOffsetFil_F()
Get16k_hpBandsSize_B()
DoWaveProc_B()
DoCompCeps_B()
Get16k_p_hpBands_B()
Get16k_p_bufferCodeForBands16k_B()
Get16k_p_CodeForBands16k_B()
Get16k_p_bufferCodeWeights_B()
Get16k_p_codeWeights_B()
Set16k_hpBands_dec_B()
TeagerEng()
GetTeagerFilter()
CepsCompute()
DoPostProc_B()
DoVADProc_F()
focalpoint()
GetMaximaPositions()
Get16k_p_bufferCodeWeights_B()
Get16k_p_bufferCodeForBands16k_B()
PreEmphHamm()
ff4NB16_B()
GetBandsForDecoding16k_B()
DecodeBands16k_B()
FilterBank()
Get16k_hpBands_dec_B()
Get16k_p_hpBands_B()
MergeSSandCoded_B()
CorrectEnergy_B()
CosInv16Khz()
cosInv() (only for 8 kHz)
Table 2: VQ call structure

main()
quantize_and_print()
get_best_dataframe()
quant_pitch_abs()
get_class_bit()
quant_pitch_diff()
get_class_bit()
mfcc_crc_encode()
pc_crc_encode()
best_centroid()
Table 3: Extension call structure

main()
RVC_ConstructPitchRom_be()
RVC_ConstructPitchMeter_be()
RVC_DestructPitchRom_be()
RVC_DestructPitchMeter_be()
DoAdvProcess_B()
Allocate_InterpolatedDft_be()
RVC_ResetPitchMeter_be()
Deallocate_InterpolatedDft_be()
DoPitchExtract()
FilterBank()
dsr_afe_vad()
get_vm()
IsLowBandNoise()
get_zcm()
pre_process()
RVC_MeasurePitch_be()
fnLog2()
iir_d()
iir_s()
ClearPitch_be()
DirichletInterpolation_be()
IsLowLevelInput_be()
Finalize_be()
IsContinuousPitch_be()
Mpy_lw_sw()
PrepareSpectralPeaks_be()
Mpy_lw_sw()
CalcSpectrum_be()
FindPeaks_be()
PrelimScaleDownAmpsOfHighFreqPeaks_be()
qsort_be()*
CompareIpointAmp_be()
RefineSpectralPeaks_be()
FindPitchCandidates_be()
FinalScaleDownAmpsOfHighFreqPeaks_be()
Mpy_lw_sw()
Mpy_lw_sw()
Mpy_lw_sw_Add()
swap()
sqrt_l_fix()
NormalizeAmplitudes_be()
CalcUtilityFunction_be()
CreatePieceWiseConstantFunction_be()
qsort_be()*
Compare_ARRAY_OF_XPOINTS_be()
LinkArrayOfPoints_be()
AddSortedArrayOfPoints_be()
FindDominantLocalMaximaInUtilityFunction_be()
UtilityFunctionAtGivenPitchFreq_be()
qsort_be()*
ComparePitchFreqAscending_be()
SelectTopPitchCandidates_be()
compute_pcorr_be()
ConvertLinkedListOfDiffPointsToUtilFunc_be()
L_Extract()
Mpy_32_16()
swap()
LinkArrayOfPoints_be()
Mpy_lw_sw()
swap()
Mpy_lw_sw()
interpolate_be()
sqrt_l_fix()
find_most_energetic_window_be()
accumulate_be()
SelectFinalPitch_be()
qsort_be()*
find_most_energetic_window2_be()
Mpy_lw_sw()
Mpy_lw_sw()
Mpy_lw_lw()
ComparePitchFreqDescending_be()
ClearPitch_be()
GOOD_ENOUGH_be()
CLOSELY_LOCATED_be()
BETTER_be()
IsContinuousPitch_be()
classify_frame()
CalculateDoubleWindowDft_be()
swap()
Mpy_lw_sw()
Mpy_lw_sw()

* qsort_be() is a recursive function.
4.5 Variables, constants and tables

The data types of variables and tables used in the fixed point implementation are signed integers in 2's complement
representation, defined by:

- Word16 16 bit variable;

- Word32 32 bit variable.
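
A minimal sketch of how these two types could be declared on a C99 compiler is given below. The distributed headers may well define them differently (for example through short and int or long on the verified platforms); the use of <stdint.h> and the checking function word_types_ok() are assumptions made for this illustration.

```c
#include <limits.h>
#include <stdint.h>

typedef int16_t Word16;  /* 16 bit signed variable, 2's complement */
typedef int32_t Word32;  /* 32 bit signed variable, 2's complement */

/* Returns 1 when both types have the required width and are signed. */
int word_types_ok(void)
{
    return sizeof(Word16) * CHAR_BIT == 16 &&
           sizeof(Word32) * CHAR_BIT == 32 &&
           (Word16)-1 < 0 &&
           (Word32)-1 < 0;
}
```

Fixing the widths explicitly matters for bit exactness: a type that is wider than expected on some platform would change overflow and saturation behaviour.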
4.5.1 Description of constants used in the C-code

Table 5a: Global constants for AFE

Constant | Value | Description
NS_SPEC_ORDER_16K | 64 | Noise suppression array length
NS_HANGOVER_16K | 15 | Noise suppression hangover count
NS_MIN_SPEECH_FRAME_HANGOVER_16K | 4 | Noise suppression minimum speech frame hangover count
NS_ANALYSIS_WINDOW_16K | 80 | Noise suppression analysis window
PERC_CODED | 0.7 | Lambda merge (empirically set constant)
LAMBDA_NSE16k | 0.99 | Noise estimation lambda
NS_NB_FRAME_THRESHOLD_NSE | 100 | Noise suppression number of frame threshold used for NSE
LENGTH_QMF | 118 | QMF filter length
f24 | 1 | Multiplier for QMF filter coefficients
SHFF_H | 8 | Shift to get higher value
L_H | 16 | Shift to get lower value
HP16k_MEL_USED | 3 | Higher frequency band Mel used
NB_LP_BANDS_CODING | 3 | Lower frequency band used in coding
NE16k_FRAMES_THRESH | 100 | Noise estimation frames threshold
NB_TOPOSTPROC | 12 | Number of coefficients to postprocess
CEP_FRAME_LENGTH | 200 | Frame length for cepstral coefficients
CEP_NB_COEF | 13 | Number of cepstral coefficients (including c0)
CEP_NB_CHANNELS | 23 | Number of filters used for cepstral coefficients
CEP_FFT_LENGTH | 256 | FFT length for cepstral coefficients
FRAME_BUF_SIZE | 241 | Denoised output buffer size
FRAME_SHIFT | 80 | WaveProcessing input frame shift
FRAME_LENGTH | 200 | WaveProcessing frame size
NS_SPEC_ORDER | 65 | Noise suppression array length (8 kHz)
NS_BUFFER_SIZE | 180 | Noise suppression past frame size
NS_FRAME_SHIFT | 80 | Noise suppression input frame shift
NS_HALF_FILTER_LENGTH | 8 | Noise suppression filter half size
NS_NB_FRAME_THRESHOLD_LTE | 10 | Noise suppression long term energy forgetting factor threshold (in frames)
NS_NB_FRAME_THRESHOLD_NSE | 100 | Noise suppression spectrum estimate forgetting factor threshold (in frames)
NS_MIN_FRAME | 10 | Number of frame threshold to update average energy for noise suppression VAD
NS_FFT_LENGTH | 256 | FFT length for noise suppression
WF_MEL_ORDER | 25 | Noise suppression Wiener filter order
SHFT_NOISE | 14 | Shift applied to noise spectrum estimate
SHFT_FACT_MUL | 14 | Shift applied to gain coefficient (noise suppression gain factorization)
IDCT_ORDER | 25 | Noise suppression IDCT order
NS_BETA | 0.98 | Noiseless signal suppression factor
NS_RSB_MIN | 0.079432823 | Minimum a priori SNR
NS_LAMBDA_NSE | 0.99 | Forgetting factor for noise spectrum estimate
NS_LOG_SPEC_FLOOR | -10.0 | Average energy minimum threshold
NS_SNR_THRESHOLD_VAD | 15 | SNR threshold for noise suppression VAD
NS_SNR_THRESHOLD_UPD_LTE | 20 | Long term energy update threshold for noise suppression VAD
NS_ENERGY_FLOOR | 80 | Energy minimum threshold for noise suppression VAD
MaxPos | 10 | Maximum number of maxima in waveprocessing
WP_EPS | 0.2 | Weighting value added or subtracted for waveprocessing
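
As an illustration of how the frame constants relate to time, the sketch below converts the frame length (200) and frame shift (80) from Table 5a to milliseconds and counts the complete frames in a buffer. It assumes that both values are sample counts at the 8 kHz rate, and it uses the generic overlapping-frame formula; the actual buffering and flushing behaviour of the C code is not modelled, and the function names are hypothetical.

```c
#define FRAME_LENGTH 200  /* frame size, value taken from Table 5a */
#define FRAME_SHIFT   80  /* frame shift, value taken from Table 5a */

/* Convert a number of samples to milliseconds at the given sampling rate. */
double samples_to_ms(long samples, long sample_rate_hz)
{
    return 1000.0 * (double)samples / (double)sample_rate_hz;
}

/* Number of complete overlapping frames that fit into n_samples,
 * assuming the first frame starts at sample 0. */
long count_full_frames(long n_samples)
{
    if (n_samples < FRAME_LENGTH)
        return 0;
    return (n_samples - FRAME_LENGTH) / FRAME_SHIFT + 1;
}
```

Under the 8 kHz assumption, samples_to_ms(FRAME_LENGTH, 8000) gives 25 ms and samples_to_ms(FRAME_SHIFT, 8000) gives 10 ms, so consecutive frames overlap by 15 ms.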



Table 5b: Global constants for VQ

Constant | Value | Description
MIN_PERIOD | 1245184 | Minimum pitch period allowed
MAX_PERIOD | 9175040 | Maximum pitch period allowed
NUM_MULTI_LEVELS_1 | 26 | Number of levels in pitch quantization
NUM_MULTI_LEVELS_2 | 24 | Number of levels in pitch quantization
UNVOICED_CODE |  | Init value for Qpindex



Table 5c: Global constants for Extension

Constant | Value | Description
HISTORY_LEN | 100 | History length - past samples for pitch extraction
DOWN_SAMP_FACTOR | 4 | Down-sampling factor - used in computing correlation
NO_OF_DFT_POINTS | 128 | Number of DFT points
BREAK_POINT | 12 | Break point - marks the end of low frequency band
LBN_HIST_WEIGHT | 32440 | Low band noise history weight
LBN_CURR_WEIGHT | 328 | Low band noise current weight (32768 - LBN_HIST_WEIGHT)
LBN_MAX_THR | 124518 | Low band noise maximum threshold
LBN_LOW_ENR_LEVEL_MANT | 32000 | Low band noise low energy level mantissa
LBN_LOW_ENR_LEVEL_SHFT | 22 | Low band noise low energy level shift
RVC_OK | 0 | Return code for success
RVC_ERR | -1 | Return code for unspecified error
RVC_ERR_NOT_ENOUGH_MEMORY | -2 | Return code for not enough memory
RVC_ERR_ILLEGAL_ARGUMENT | -3 | Return code for an illegal input / output argument
RVC_ERR_IO_FAILED | -4 | Return code for failed input / output to a file
RVC_ERR_BAD_FILE_FORMAT | -5 | Return code for a bad file header
RVC_ERR_NOT_INITIALIZED | -6 | Return code for failure due to improper initialization
RVC_ERR_ILLEGAL_USAGE | -7 | Return code for illegal usage of a function
RVC_ERR_NOT_ENOUGH_SAMPLES | -8 | Return code for insufficient number of samples
RVC_ERR_NOT_IMPLEMENTED | -9 | Return code for an unimplemented function
RVC_ERR_FAIL_OPEN_FILE | -10 | Return code for failure to open a file
UB_ENRG_FRAC | 59 | Upper band energy fraction
ZCM_THLD | 87 | Zero crossing measure threshold
SQRT_ONE_HALF | 0x5A82 | Square root of 0.5 (0.707)
FRAME_LEN_DS | 50 | Frame length downsampled (200/4)
FRAME_LEN_DS_BY_2 | 25 | Frame length downsampled divided by 2
HISTORY_LEN_DS | 25 | History length downsampled (100/4)
WINDOW_LENGTH | 18 | Window length used in computing correlation
INV_WINDOW_LENGTH | 1820 | Inverse of window length (1/18 = 0.05556)
NUM_CHAN | 23 | Number of channels or Mel-frequency bands
MIN_CH_ENRG_MANTISSA | 20000 | Minimum channel energy mantissa
MIN_CH_ENRG_SHIFT | 25 | Minimum channel energy shift
INIT_SIG_ENRG_MANTISSA | 30518 | Initial signal energy mantissa
INIT_SIG_ENRG_SHIFT | 8 | Initial signal energy shift
CE_SM_FAC | 18022 | Channel energy smoothing factor
CE_SM_FAC_COMPL | 14746 | Channel energy smoothing factor complement
CNE_SM_FAC | 3277 | Channel noise energy smoothing factor
CNE_SM_FAC_COMPL | 29491 | Channel noise energy smoothing factor complement
LO_GAMMA | 22938 | Low gamma value
LO_GAMMA_COMPL | 9830 | Low gamma value complement
HI_GAMMA | 29491 | High gamma value
HI_GAMMA_COMPL | 3277 | High gamma value complement
LO_BETA | 31130 | Low beta value
HI_BETA | 32702 | High beta value
INIT_FRAMES | 10 | Initial number of frames (considered to be noise frames)
SINE_START_CHAN | 4 | Sine start channel (for sine wave detection)
PEAK_TO_AVE_THLD | 10 | Peak to average threshold
DEV_THLD | 1523942 | Deviation threshold
HYSTER_CNT_THLD | 9 | Hysteresis count threshold
F_UPDATE_CNT_THLD | 500 | Forced update count threshold
NON_SPEECH_THLD | 32 | Non-speech threshold
FIX_34 | 24576 | (short) (32768.0 * 3.0/4.0)
FIX_18 | 4096 | (short) (32768.0 * 1.0/8.0)
FIX_INVSQRT2 | -23170 | 1 / sqrt(2)
swTHIRD_REF_BANDWIDTH | 85 | One third of the reference bandwidth
swTWO_THIRDS_REF_BANDWIDTH | 171 | Two thirds of the reference bandwidth
MIN_ENERGY_MANTISSA | 25600 | Minimum energy mantissa
MIN_ENERGY_SHIFT | 18 | Minimum energy shift
swREF_SAMPLE_RATE_Q0 | 0x1F40 | Reference sampling rate in Q0 format
swCLOSE_FACTOR_Q14 | 0x4CCD | Closeness factor in Q14 format
swFD_SCORE_THLD1_Q15 | 0x63D7 | Frequency domain score threshold 1 in Q15 format
swFD_SCORE_THLD2_Q15 | 0x570A | Frequency domain score threshold 2 in Q15 format
swCORR_THLD_Q15 | 0x651F | Correlation threshold in Q15 format
swSUM_THLD_Q14 | 0x6667 | Sum threshold in Q14 format
lwCRITO_OFFSET_Q15 | 0x0000170A | Offset for finding a better pitch candidate in Q15 format
swCANDCORR_THLD1_Q15 | 0x799A | Pitch candidate correlation threshold 1 in Q15 format
swCANDCORR_THLD2_Q15 | 0x599A | Pitch candidate correlation threshold 2 in Q15 format
swCANDCORR_THLD3_Q15 | 0x6CCD | Pitch candidate correlation threshold 3 in Q15 format
swCANDAMP_THLD3_Q15 | 0x68F6 | Pitch candidate amplitude threshold 3 in Q15 format
swSTARTFREQ_COEFF | 0x553F | Start frequency coefficient (for candidate search)
swENDFREQ_COEFF | 0x4666 | End frequency coefficient (for candidate search)
DIRICHLET_KERNEL_SPAN | 8 | Dirichlet kernel span (for interpolation)
REF_SAMPLE_RATE | 8000 | Reference sampling rate
REF_BANDWIDTH | 4000 | Reference bandwidth
lwTHIRD_REF_BANDWIDTH | 87381333 | One third of the reference bandwidth
lwTWO_THIRDS_REF_BANDWIDTH | 174762667 | Two thirds of the reference bandwidth
swCENTER_WEIGHT | 0x5000 | Center weight
swSIDE_WEIGHT | 0x1800 | Side weight
swAMP_SCALE_DOWN1 | 0x5333 | Amplitude scale down factor 1
swAMP_SCALE_DOWN2 | 0x399A | Amplitude scale down factor 2
swAMP_SCALE_DOWN2b | 0x7333 | Amplitude scale down factor 2b
swUDIST1 | -4160 | Utility function distance 1
swUDIST2 | -6400 | Utility function distance 2
swUSTEP | -16384 | Utility function step
swFREQ_MARGIN1 | 0x4AE1 | Frequency margin 1
swAMP_MARGIN1 | 0x07AE | Amplitude margin 1
swAMP_MARGIN2 | 0x07AE | Amplitude margin 2
MIN_STABLE_FRAMES | 6 | Minimum number of stable frames
MAX_TRACK_GAP_FRAMES | 2 | Maximum pitch track gap frames
swSTABLE_FREQ_UPPER_MARGIN | 0x4E14 | Stable frequency upper margin
swSTABLE_FREQ_LOWER_MARGIN | 0x68EB | Stable frequency lower margin
UNVOICED |  | Pitch frequency of an unvoiced frame
lwMAX_PITCH_FREQ | 0x01A40000L | Maximum pitch frequency
lwMIN_PITCH_FREQ | 0x00340000L | Minimum pitch frequency
MAX_PITCH_FREQ | 420 | Maximum pitch frequency in Hz
MIN_PITCH_FREQ | 52 | Minimum pitch frequency in Hz
HIGHPASS_CUTOFF_FREQ | 300 | Highpass cut-off frequency in Hz
NO_OF_FRACS | 77 | Number of fractions in the fractions table
lwSHORT_WIN_START_FREQ | 0x00C80000L | Short window start frequency
lwSHORT_WIN_END_FREQ | 0x01A40000 | Short window end frequency
lwSINGLE_WIN_START_FREQ | 0x00640000L | Single window start frequency
lwSINGLE_WIN_END_FREQ | 0x00D20000L | Single window end frequency
lwDOUBLE_WIN_START_FREQ | 0x00340000 | Double window start frequency
lwDOUBLE_WIN_END_FREQ | 0x00780000L | Double window end frequency
MAX_LOCAL_MAXIMA_ON_SPECTRUM | 70 | Maximum number of local maxima on the spectrum
MAX_PEAKS_FOR_SORT | 30 | Maximum number of peaks for sorting
MAX_PEAKS_PRELIM | 7 | Maximum number of peaks (preliminary)
MIN_PEAKS | 7 | Minimum number of peaks
MAX_PEAKS_FINAL | 20 | Maximum number of peaks (final)
MAX_PRELIM_CANDS | 4 | Maximum number of preliminary candidates (pitch)
CREATE_PIECEWISE_FUNC_LOOP_LIM_SH | 20 | Create piecewise function loop limit for short window
CREATE_PIECEWISE_FUNC_LOOP_LIM_SNG | 30 | Create piecewise function loop limit for single window
CREATE_PIECEWISE_FUNC_LOOP_LIM_DBL | 60 | Create piecewise function loop limit for double window
swSUM_FRACTION | 0x799A | Sum fraction
swAMP_FRACTION | 0x33F8 | Amplitude fraction
MAX_BEST_CANDS | 2 | Maximum number of best candidates (pitch)
N_OF_BEST_CANDS_SHORT | 2 | Number of best candidates for short window
N_OF_BEST_CANDS_SINGLE | 2 | Number of best candidates for single window
N_OF_BEST_CANDS_DOUBLE | 2 | Number of best candidates for double window
N_OF_BEST_CANDS | 6 | Number of best candidates for all windows
SIZE_SCRATCH_DOPITCH | 1090 | Scratch memory size for DoPitch() function (this is the actual size required; the declared size in the C simulation is 1632)
SIZE_SCRATCH_ADVPROCESS | 825 | Scratch memory size for DoAdvProcess() function (this is the actual size required; the declared size in the C simulation is 1100)
RVC_PITCH_ROM_SIG | 11031 | Signature for RVC_PITCH_ROM structure
RVC_PITCH_METER_SIG | 21053 | Signature for RVC_PITCH_METER structure
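
Several constants in the table above are fixed-point values in a Q format, where a Qn number is an integer that represents its value multiplied by 2^n. The sketch below shows the conversion in both directions. The helper names are hypothetical, and the round-to-nearest rule is an assumption; the table does not state how each constant was rounded.

```c
#include <stdint.h>

/* Interpret a fixed-point integer with frac_bits fractional bits
 * as a real number. */
double q_to_double(int32_t value, int frac_bits)
{
    return (double)value / (double)((int64_t)1 << frac_bits);
}

/* Quantize a real number to a fixed-point integer, rounding to nearest.
 * No overflow check is done: the caller must keep x within range. */
int32_t double_to_q(double x, int frac_bits)
{
    double scaled = x * (double)((int64_t)1 << frac_bits);
    return (int32_t)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}
```

Some entries can be checked this way: double_to_q(0.75, 15) gives 24576, the value listed for (short)(32768.0 * 3.0/4.0), and q_to_double(0x5A82, 15) is approximately 0.7071, the square root of 0.5. The value 0x01A40000 equals 420 when read with 16 fractional bits, which is consistent with the maximum pitch frequency of 420 Hz, although the table does not name that format explicitly.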
4.5.2 Description of fixed tables used in the C-code

This section contains a listing of all fixed tables sorted by source file name and table name. All table data is declared as
Word16.

Table 6a: Fixed tables for AFE

File | Table name | Length | Description
16kHzProcessing_B.c | table_pow2 | 33 | Table for square root
 | LambdaNSEx2 | 100 | Table used to compute first 100 LambdaNSE
 | dp02_h | 59 | MSB of QMF filter coefficients
 | dp02_l | 43 | LSB of QMF filter coefficients
PostProc_B.c | targetLMS16 | 12 | Target for blind equalization
ComCeps_B.c | HalfHamming16 | 100 | Hamming window coefficients
 | CosMatrix16 | 144 | Inverse cosine coefficients at 8 kHz (not used at 16 kHz)
 | CosMatrix16_16khz | 156 | Inverse cosine coefficients at 16 kHz
 | pondMelFilter | 309 | Mel bank coefficients
ff4nrFix16_B.c | tabSin | 64 | Sine table
 | tabCos | 64 | Cosine table
MathFunc.c | tbIntO | 48 | Coefficients for computation of square root
ExtNoiseSup_B.c | lambda_1divX | 20 | Computation of 1/N
 | Hann_sh32_hi | 100 | MSB of Hanning window coefficients (32 bits)
 | Hann_sh32_lo | 100 | LSB of Hanning window coefficients (32 bits)
 | Hann_sh24_hi | 100 | MSB of Hanning window coefficients (24 bits)
 | Hann_sh24_lo | 100 | LSB of Hanning window coefficients (24 bits)
 | pondMelFilterNoise | 157 | Mel-frequency scale coefficients (applied to the Wiener filter)
 | idctMel16 | 234 | Mel-warped inverse DCT coefficients
 | pondMelFilter16k | 134 | Filter bank coefficients at 16 kHz
 | M1_LamdaLTE | 8 | Computation of 1/N
 | M1_LambdaNSEx2 | 100 | Computation of 2/N
 | M1_LamdaNSE | 9 | Computation of 1/N
 | minvLambda16 | 10 | Computation of 2/N

Table 6b: Fixed tables for VQ

File | Table name | Length | Description
coder_VAD.c | quantizer16kHz_1 | 128 | VQ table
 | quantizer16kHz_2_3 | 128 | VQ table
 | quantizer16kHz_4_5 | 128 | VQ table
 | quantizer16kHz_6_7 | 128 | VQ table
 | quantizer16kHz_8_9 | 128 | VQ table
 | quantizer16kHz_10_11 | 64 | VQ table
 | quantizer16kHz_12_13 | 512 | VQ table
 | quantizer8kHz_1 | 128 | VQ table
 | quantizer8kHz_2_3 | 128 | VQ table
 | quantizer8kHz_4_5 | 128 | VQ table
 | quantizer8kHz_6_7 | 128 | VQ table
 | quantizer8kHz_8_9 | 128 | VQ table
 | quantizer8kHz_10_11 | 64 | VQ table
 | quantizer8kHz_12_13 | 512 | VQ table
 | weight16kHz_c0_shift |  | VQ weights
 | weight16kHz_c0_norm |  | VQ weights
 | weight16kHz_logE |  | VQ weights
 | weight8kHz_c0_shift |  | VQ weights
 | weight8kHz_c0_norm |  | VQ weights
 | weight8kHz_logE |  | VQ weights
 | plwQuantLevels[127] | 127*2 | VQ tables for pitch/class quantization
 | ppplwQuantSections[8][3] | 24*2 | VQ tables for pitch/class quantization
 | plwQuantLevels[31] | 31*2 | VQ tables for pitch/class quantization
 | pplwQuantSections[4][3] | 12*2 | VQ tables for pitch/class quantization
 | pswRatioThld_1[4][6] | 24 | VQ tables for pitch/class quantization
 | piMultiLevelIndex[4] | 4 | VQ tables for pitch/class quantization
 | pswRatioThld_2[4][8] | 32 | VQ tables for pitch/class quantization
 | piMultiLevelIndex_2[4] | 4 | VQ tables for pitch/class quantization
 | swAlpha1 | 1 | Pitch/class constants
 | swAlpha2 | 1 | Pitch/class constants
Table 6c: Fixed tables for Extension

File | Table name | Length | Description
ExtNoiseSup_B.c | pswPePower | 129 | Coefficients to compute the pre-emphasis power spectrum
preProc_B.c | pswHpfCoef | 15 | High pass filter coefficients
preProc_B.c | pswLpfCoef | 15 | Low pass filter coefficients
preProc_B.c | pswLfeCoef | 3 | Low frequency emphasis filter coefficients
dsrAfeVad_B.c | piBurstConst | 20 | Burst length constants for different SNRs
dsrAfeVad_B.c | piHangConst | 20 | Hang length constants for different SNRs
dsrAfeVad_B.c | piVADThld | 20 | VAD voice metric thresholds for different SNRs
dsrAfeVad_B.c | piVMTable | 90 | Voice metric table as a function of SNR index
dsrAfeVad_B.c | piSigThld | 20 | Signal threshold table as a function of SNR
dsrAfeVad_B.c | piUpdateThld | 20 | Update threshold table as a function of SNR
dsrAfeVad_B.c | pswShapeTable | 23 | Spectral shape correction table
fix_mathlib.c | coeff_sqrtS_58 | 5 | Coefficients for computation of square root
fix_mathlib.c | coeff_sqrtS_78 | 5 | Coefficients for computation of square root
rvc_pitch_init_B.h | ROM_astFrac | 312 | Fractions table
rvc_pitch_init_B.h | ROM_pstWindowshiftTable | 514 | Complex exponents table for time shifting in frequency domain
rvc_pitch_init_B.h | ROM_aswDirichletImag | 8 | Imaginary part of the Dirichlet kernel



4.5.3 Static variables used in the C-code

This section contains tables that specify the static variables for the AFE, VQ, and Extension, respectively.
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Table 7a: AFE static variables 



Struct Name 


Variable 


Type[Length] 


Description 


QMF FIR 










lengthQMF 


Word32 


OMF Filter length 




•dp 1 


Word 16 


OMF filter low frequency Coeff 




*dp h 


Word 16 


OMF filter high frequency Coeff 




*T 


Word 16 


Temporary OMF filter buffer 




T dec 


Word 16 


Multiplier for T 


DataFor16kProc B 










FrameLength 


Word32 


Input Frame length 




FrameShift 


Word32 


Shift value for the frame 




numFrameslnBuffer 


Word32 


Number of frames in buffer 




Sampling Frequency 


Word32 


Sampling frequency (8/16) 




Do16kHzProc 


BOOLEAN 


Flag to enable 16kHz processing 




"hpBands B 


Word32 


Bufferfor HP bands 




hpBandsSize 


Word32 


hpBands B buffer size 




CodeForBandslSk B 


Word32[9] 


HP coding buffer 




bufferCodeForBands16k B 


Word32[27] 


buffer used for HP coding 




codeWeights B 


Word16[3] 


code Weights buffer 




bufferCodeWeights B 


Word16[9] 


buffer used for code Weights 




* pQMF Fir 


OMF FIR 


Pointer to QMF FIR structure 




*bufferData16k B 


Word32 


temporary buffer to carry OMF LP data 




bufDatal SkSize 


Word32 


1 6k data buffer size 




*FirstWindow16k 


MelFB Window 


pointer to MelFB Window structure 




noiseSEISk B 


Word32[3] 


noise spectrul energy variable 




noise dec 


Word 16 


Multiplier for noiseSEI 6k B 




BandsForCoding16k B 


Word32[9] 


buffer for storing Bands for Coding 




vadCounter16k 


Word32 


vad flag counter 




vad16k 


Word32 


vad flag 




nbSpeechFramesI 6k 


Word32 


number of speech frames counter 




hangOverl 6k 


Word32 


hang over used for VAD 




meanEnlBk 


Word32 


mean Energy variable 




nb frame threshold nse 


Word32 


threshold NSE for frame 




lambda nse 


Word 16 


lambda NSE variable 




"dataHP B 


Word32 


buffer stores QMF HP value 




dec 16k 


Word16[5] 


Multiplier for dataHP B buffer 




BFC dec 


Word16[1] 


Multiplier for computing bands for coding 




fb16k dec 


Word16[3] 


Buffer is used to store multiplier for current and pervious two frames 


PostProcStructX 










weightLMS 


Word32[121 


Current LMS weight 


CompCepsStructX 










FFTLength 


Word32 


FFT size 




Do16khzProc 


Word 16 


Flag to enable 16kHz processing 




*pData16k 


Word32 


Pointer to data for 1 6Khz processing 


WaveProcStructX 










*TeagerFilter16 


Word32 


Pointer to teager filter 




*TeagerWindow32 


Word32 


Pointer to teager window 




TeagerOnset 


Word32 


Unused 




FrameLength 


Word32 


Input frame length 


ns_var_F
                  SampFreq                        Word16        Sampling frequency (8/16)
                  Do16khzProc                     Word16        Flag to enable 16kHz processing
                  buffers.nbFramesInFirstStage    Word32        number of frames in first stage
                  buffers.nbFramesInSecondStage   Word32        number of frames in second stage
                  buffers.nbFramesOutSecondStage  Word32        number of frames out of second stage
                  buffers.FirstStageIn16Buffer    Word16[180]   First stage buffer
                  buffers.SecondStageInBuffer32   Word32[180]   Second stage buffer
                  buffers.SecondDecalSig          Word16[4]     Shift factor for each sub-frame of second stage buffer
                  prevSamples32.lastSampleIn32    Word32        Last input sample of DC offset compensation
                  prevSamples32.lastDCOut32       Word32        Last output sample of DC offset compensation
                  prevSamples32.oldShift          Word16        Previous window shift factor of DC offset compensation
                  spectrum.indexBuffer1           Word16        Where to enter new PSD for first stage, alternatively 0 and 1
                  spectrum.indexBuffer2           Word16        Where to enter new PSD for second stage, alternatively 0 and 1
                  spectrum.noiseSE1_32            Word32[65]    Noise spectrum estimate for first stage
                  spectrum.noiseSE1_dec           Word16[65]    Shift factor for Noise spectrum estimate (first stage)
                  spectrum.noiseSE2_32            Word32[65]    Noise spectrum estimate for second stage
                  spectrum.noiseSE2_dec           Word16[65]    Shift factor for Noise spectrum estimate (second stage)
                  spectrum.PSDMeanAntBuffer1      Word32[65]    1st stage PSD Mean buffer for precedent frame
                  spectrum.nSigSE1Ant_dec         Word16[65]    Shift factor for PSD Mean buffer for precedent frame (1st stage)
                  spectrum.PSDMeanAntBuffer2      Word32[65]    2nd stage PSD Mean buffer for precedent frame
                  spectrum.nSigSE2Ant_dec         Word16[65]    Shift factor for PSD Mean buffer for precedent frame (2nd stage)
                  spectrum.denSigSE1_32           Word32[65]    1st stage PSD Mean buffer
                  spectrum.nSigSE1Cur_dec         Word16[65]    Shift factor for PSD Mean buffer (1st stage)
                  spectrum.denSigSE2_32           Word32[65]    2nd stage PSD Mean buffer
                  spectrum.nSigSE2Cur_dec         Word16[65]    Shift factor for PSD Mean buffer (2nd stage)




                  vad_data_ns_F.nbFrame           Word16[2]     Number of frames (for the 2 stages)
                  vad_data_ns_F.flagVAD           Word16        Vad Flag (1 = SPEECH, 0 = NON SPEECH)
                  vad_data_ns_F.hangOver          Word16        hangover
                  vad_data_ns_F.nbSpeechFrames    Word16        Number of speech frames (used to set hangover)
                  vad_data_ns_F.meanEn32          Word32        Mean energy for VAD
                  vad_data_ca.flagVAD             Word16        Vad Flag (1 = SPEECH, 0 = NON SPEECH)
                  vad_data_ca.hangOver            Word16        hangover
                  vad_data_ca.nbSpeechFrames      Word16        Number of speech frames (used to set hangover)
                  vad_data_ca.meanEn32            Word32        Mean energy for VAD
                  vad_data_fd.MelMean             Word16        SpeechQMel (for frame dropping)
                  vad_data_fd.VarMean             Word32        SpeechQVar (for frame dropping)
                  vad_data_fd.AccTest             Word32        SpeechQSpec (for frame dropping)
                  vad_data_fd.AccTest2            Word32
                  vad_data_fd.SpecMean            Word32        SpecMean (for frame dropping)
                  vad_data_fd.MelValues           Word16[2]     SpeechQMel (for frame dropping)
                  vad_data_fd.SpecValues          Word32        SpeechQSpec (for frame dropping)
                  vad_data_fd.SpeechInVADQ        Word16        Flag (for frame dropping)
                  vad_data_fd.SpeechInVADQ2       Word16        Flag (for frame dropping)
                  gainFact.logDenEn1_32           Word32[3]     Denoise frame energy for gain factorization
                  gainFact.lowSNRtrack32          Word32        Low SNR level for gain factorization
                  gainFact.alfaGF16               Word16        Wiener filter gain factorization coefficient


VADStructX_F
                  Focus                           Word16        Position of circular buffer
                  HangOver                        Word16        Hangover length
                  FlushFocus                      Word16        Position in circular buffer when emptying at end
                  H_CountDown                     Word16        Main hangover countdown
                  V_CountDown                     Word16        Short hangover countdown
                  **OutBuffer                     Word32        OutBuffer pointer pointer
                  *OutBuffer                      Word32[7]     OutBuffer pointer
                  OutBuffer                       Word16[7x15]  OutBuffer



Table 7b: VQ static variables

Struct Name       Variable                        Type[Length]  Description
coder_VAD.c       four_frames[27]                 Word16[27]    Previous frames used to build multiframe
                  plwQPHistory[3]                 Word32[3]     History of Pitch
                  iReliableFlag                   Word16        Pitch reliability flag

Table 7c: Extension static variables

Struct Name       Variable                        Type[Length]  Description
                  iFirstFrameFlag                 Word16        First frame flag
                  pswUBSpeech                     Word16[200]   Upper band speech
                  pswDownSampledProcSpeech        Word16[75]    Down-sampled processed speech
                  lwCritMax                       Word32        Maximum power ratio
                  iOldPitchPeriod                 Word16        Old pitch period value
                  iOldFrameNo                     Word16        Old frame number


PCORR STATE be    s be
                  lwX1_X1                         Word32        X1*X1
                  lwZ1_Z1                         Word32        Z1*Z1
                  lwZ2_Z2                         Word32        Z2*Z2
                  lwX1_Z1                         Word32        X1*Z1
                  lwX1_Z2                         Word32        X1*Z2
                  lwZ1_Z2                         Word32        Z1*Z2
                  swX1_Sum                        Word16        Sum of X1
                  swZ1_Sum                        Word16        Sum of Z1
                  swZ2_Sum                        Word16        Sum of Z2
                  iBurstConst                     Word16        Burst constant
                  iBurstCount                     Word16        Burst count
                  iHangConst                      Word16        Hang constant
                  iHangCount                      Word16        Hang count
                  iVADThld                        Word16        VAD threshold
                  iFrameCount                     Word16        Frame count
                  iFUpdateFlag                    Word16        Forced update flag
                  iHysterCount                    Word16        Hysteresis count
                  iLastUpdateCount                Word16        Last update count
                  iSigThld                        Word16        Signal threshold
                  iUpdateCount                    Word16        Update count
                  iChanEnrgShift                  Word16        Channel energy shift
                  iChanNoiseEnrgShift             Word16        Channel noise energy shift
                  pswChanEnrg                     Word16[23]    Channel energy
                  pswChanNoiseEnrg                Word16[23]    Channel noise energy
                  swBeta                          Word16        Beta value
                  swSnr                           Word16        SNR value


NormSw            pnsLogSpecEnrgLong
                  swMantissa                      Word16[23]    Mantissa
                  iShift                          Word16[23]    Shift
                  swC0                            Word16        C0 value
                  swC1                            Word16        C1 value
                  swC2                            Word16        C2 value
                  pswHpfXState                    Word16[6]     High pass filter input state
                  pswHpfYState                    Word16[12]    High pass filter output state
                  pswLpfXState                    Word16[6]     Low pass filter input state
                  pswLpfYState                    Word16[12]    Low pass filter output state
                  pswLfeXState                    Word16        Low frequency emphasis filter input state
                  pswLfeYState                    Word16[2]     Low frequency emphasis filter output state

5 File formats

This section describes the file formats used by the AFE, VQ & Extension programs.



5.1 Speech file 



Speech files read by the X-AFE and written by the Extension consist of 16-bit words. The byte order depends on the
host architecture (e.g. MSByte first on SUN workstations, LSByte first on PCs, etc.).
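
Because the byte order of a speech file follows the host that wrote it, a file moved between architectures has to be byte-swapped on reading. The following is a minimal sketch of that step, not part of the reference code: the function names (host_is_little_endian, swap16, read_speech_samples) are hypothetical, the caller is assumed to know the byte order of the file, and the C99 header <stdint.h> is used for fixed-width types instead of the Word16 type of the reference code.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Returns 1 on an LSByte-first (little-endian) host, 0 on an MSByte-first one. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Exchange the two bytes of a 16-bit word. */
static uint16_t swap16(uint16_t w)
{
    return (uint16_t)(((w & 0x00FFu) << 8) | ((w >> 8) & 0x00FFu));
}

/*
 * Read up to n 16-bit samples from fp into buf and return the number read.
 * file_is_little_endian is supplied by the caller (assumption: the file
 * itself carries no byte-order marker). Samples are swapped only when the
 * file byte order differs from the host byte order.
 */
static size_t read_speech_samples(FILE *fp, int16_t *buf, size_t n,
                                  int file_is_little_endian)
{
    size_t got = fread(buf, sizeof buf[0], n, fp);
    size_t i;

    if ((file_is_little_endian != 0) != host_is_little_endian()) {
        for (i = 0; i < got; i++) {
            buf[i] = (int16_t)swap16((uint16_t)buf[i]);
        }
    }
    return got;
}
```

When the file was produced on the same kind of host, the swap is skipped and the samples are used exactly as read, which matches the behaviour described above.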
Annex A (informative):
Change history 



Date     TSG#  TSG Doc.   CR   Rev  Subject/Comment                                               Old    New
2004-06  24    SP-040343            Version 6.0.0 approved at 3GPP TSG SA#24                      2.0.0  6.0.0
2004-12  26    SP-040837  001  1    Software bug correction: Removal of Basicops simulation       6.0.0  6.1.0
                                    of "0" shift operator
2004-12  26    SP-040837  002  1    Software bug correction: Initialization of the variables Iwc  6.0.0  6.1.0
                                    and i2aScale
2004-12  26    SP-040837  003  1    Software bug correction: Wrong assignment of the              6.0.0  6.1.0
                                    variables *piReliableFlag and *pcQPIndex
2004-12  26    SP-040837  004  2    Software bug correction: Use of incorrect variable            6.0.0  6.1.0
                                    fRefPeriod instead of iRefPeriod
2004-12  26    SP-040837  005       Add reference to test sequences document                      6.0.0  6.1.0
2007-06  26                         Version for Release 7                                         6.1.0  7.0.0
2008-12  42                         Version for Release 8                                         7.0.0  8.0.0